SARS-CoV-2 sequences (greater than 0.02). Figure 7.13 shows the partial
multiple sequence alignment produced by msa, in which the ten SARS-CoV-2
sequences were clustered together exactly.
Fig. 7.13. The msa result for 17 genome sequences.
The alignment-based approach versus the alignment-free approach for sequence comparison
Since the alignment-free multiple sequence comparison approach can support
genome pattern discovery, it is interesting to compare the
alignment-free approach with the alignment-based approach in three
aspects, i.e., the speed, the accuracy and the pattern discovery power.
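
To make the alignment-free side of this comparison concrete, the following is a minimal Python sketch of word-frequency-based sequence comparison: each sequence is turned into a vector of k-mer word frequencies (k = 3 here, matching the 3-mer words used in this chapter), and sequences are then compared through distances between these vectors. The function names and the choice of the Euclidean distance are assumptions made for illustration only; the text does not state which distance measure was used.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def kmer_frequencies(seq, k=3):
    """Return the relative frequency of every possible k-mer word in seq."""
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(words, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words with ambiguous bases such as N
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (an assumption)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(seqs, k=3):
    """Return the full matrix of distances between k-mer frequency vectors."""
    vectors = [kmer_frequencies(s, k) for s in seqs]
    n = len(vectors)
    return [[euclidean(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Toy sequences, invented purely for demonstration.
    demo = ["ACGTACGT", "ACGTTCGT", "TTTTGGGG"]
    for row in pairwise_distances(demo):
        print(["%.3f" % d for d in row])
```

Because each sequence is scanned once and the vectors have a fixed length of 4^k entries, the cost of this step grows linearly with sequence length, whereas a dynamic-programming alignment of two sequences fills a table whose size is the product of their lengths.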
The speed comparison
Whether the alignment-free sequence comparison approach saves time
significantly was examined using a 3-mer word data set.
Pseudo nucleotide sequences were generated, with lengths ranging
from 1,000 to 10,000. The mutation rate on one sequence was 10%.
The Needleman-Wunsch algorithm was applied for the homology
alignment between them, i.e., the alignment-based sequence comparison.
The 3-mer word frequency was calculated for each sequence, and the
pairwise distances between sequences were calculated from the word
frequency vectors. The CPU time was recorded. Figure 7.14(a) shows the
comparison. It can be seen that the CPU time of the alignment-free